Letter from the Editor


Welcome to the Spring 2019 volume of The AIR Professional File. This volume presents
two “how-to” articles designed to guide readers through implementation processes for
assessment and data management.


Association for 
Institutional Research 


Have you ever wished that your institution could have a dedicated assessment day to collect 
information on student learning? Have you ever wondered what it would take to pull it off? 
Our colleagues from James Madison University have been doing it for more than 30 years. 

In University-wide Assessment Days: The James Madison University Model, they share what 
they have learned about the logistics and challenges of implementing university assessment 
days. Their blueprint for success may inspire you to adopt some of their strategies for your 
own institution. 


Facing a different kind of situation, our colleagues from University of Western States describe 
how they took their institution from Data Crisis to Data-Centric by eliminating the data
silos and shadow systems that engendered mistrust and replacing them with an integrated
data management system commended by regional accreditors. They provide readers with 
detailed guidance on issues of data governance, personnel, and systems. Their remarkable 
turnaround is impressive! 


These “how-to” articles illustrate ways in which a single institution can serve as a model for 
others that can benefit from these experiences. Consider sharing your own experiences with 
your data colleagues through the AIR Professional File. 


Sincerely, 
Sharron Ronco 
Abstract

James Madison University has used 
dedicated Assessment Days for more 
than 30 years to collect longitudinal 
data on student learning outcomes. 
Our model ensures all incoming 
students are tested twice: once before 
beginning classes and again after 
accumulating 45-70 credit hours. 
Although each student completes 
only four instruments during a 2-hour 
testing period, 25 different assessments 
are administered, thereby allowing for 
the examination of student growth
on a variety of different outcomes.

This article describes our model and 
outlines the logistics involved in 
planning for Assessment Day, including 
the physical and human resources 
needed for its success. We also address 
changes we have made over the years 
and the challenges we continue to 
encounter. Our intention is to share 
lessons learned and encourage readers 
to consider how our model might
be adapted for the assessment of
programs both large and small at their 
own institutions. 


Keywords: Assessment Days, large-scale assessment,
general education assessment, data collection designs


Background 

Every campus has wide-reaching 
programs intended to affect the 
learning and development of all or 
most students. Examples include 
general education, large-scale student 
affairs programs, and campus-wide 
initiatives. Given the large number of 
students served by these programs 
and the importance of their associated 
outcomes, the effectiveness of these 
programs is often of great interest
to many stakeholders. Assessment
data are therefore collected to reveal
the strengths and weaknesses of
these wide-reaching programs,
and to partially fulfill requirements
of accrediting bodies and funding
agencies. Procuring assessment
data that help universities improve
student learning and demonstrate 
accountability, however, is no trivial 
task. To acquire meaningful information, 
colleges must carefully consider the 
data collection design along with the 
numerous other details inherent in 
conducting quality research. 


The purpose of this article is to 
describe the approach James Madison 
University (JMU) has used for more 
than 30 years to collect assessment 
data for its university-wide programs. 
Like other universities, we use 
dedicated Assessment Days (Swing, 
2001). Our Assessment Day approach 
enables the university to collect 
longitudinal data on student learning 
and developmental outcomes by 
setting aside 2 days per year dedicated 
to assessment. All incoming first-year 
students (excluding transfer students) 
are required to participate in Fall 
Assessment Day (N = 4,000 students); 
all students with 45-70 credit hours 
(typically sophomores and including 
transfer students) are required to 
participate in Spring Assessment
Day (N = 4,000 students). During
Spring Assessment Day students are 
administered the same instruments 
they were administered during Fall 
Assessment Day (18 months prior), 
thereby creating a pretest-posttest 
design that permits evaluation of gains 
in student learning and development. 


Before describing Assessment Day 
logistics and resources, it is important 
to explain the two primary reasons why 
we've used this model for more than
30 years. First, our Assessment Day
model addresses major weaknesses
associated with common assessment
approaches, specifically those using a 
posttest-only design, cross-sectional 
data, or convenience samples. Second, 
we continue to use the Assessment 
Day model because it allows many 
questions about student learning and 
development to be addressed. We 
provide several examples below to 
convey the methodological advantages 
of our approach, the kinds of questions 
that can be addressed, and how the 
results are used. 


One of the greatest strengths of
our Assessment Day model is the
assessment of all incoming first-year 
students the week before classes begin. 
Results from Fall Assessment Days are 
used to explore the appropriateness 
of allowing course credit for various 
precollege experiences, as illustrated 
with the results in Table 1 for the 
American Experience assessment, 
which is used to assess our American 
History and Political Science 
requirement. The similar performance 
of incoming first-year students with 
and without dual-enrollment transfer 
credit on this and many of our other 
assessments has fueled a continuous 
debate at our university as to whether 
dual-enrollment credit should be 
permitted. 


Most importantly, Fall Assessment Day results allow for a richer and more-nuanced interpretation of Spring Assessment Day results. To illustrate, Table 2 provides the percentage of students meeting the faculty-set standard on a quantitative and scientific reasoning assessment at pretest (Fall Assessment Day) and at posttest (Spring Assessment Day). Because a larger percentage of students met the standard at posttest than at pretest, we can conclude that students are gaining in knowledge over time. If we had only posttest data, it could be argued that the posttest results reflect nothing more than the knowledge students had upon arriving at the university. Thus, Fall Assessment Day results allow us to explore, and often rule out, a plausible and competing alternative hypothesis for the posttest findings.


Table 1. Number Correct Mean and Standard Deviation on the 40-item American Experience Assessment for Incoming First-Year Students (N = 925) in 2017 by Type of Course Credit

Type of Course Credit    N      Mean    SD
Advanced Placement       57     29.4    5.7
Dual Enrollment          71     21.6    [illegible]
None                     797    21.8    6.1


Table 2. Percentage of Students Meeting Standard on Quantitative and Scientific Reasoning Assessment on Fall and Spring Assessment Days for Two Cohorts

N      Fall Assessment Day (Pretest)    Spring Assessment Day (Posttest)
367    2015: 21%                        2017: 46%
412    2016: 28%                        2018: 39%


By having each student complete the 
same assessment twice during the first 
18 months of their college career, we 
are also able to provide evidence of 
student learning. To illustrate, effect 
sizes capturing the number of standard 
deviation units by which average scores 
change from Fall to Spring Assessment 
Day are provided in Table 3 for 
assessments administered to incoming
first-year students in 2014. The effect
sizes are positive, which indicates that 
the college experience adds value. The 
fact that some of the effect sizes are not 
as large as we would like them to be is 
a call to action. For example, when the 
quantitative and scientific reasoning 
test results indicated that students 
who had completed their requirement 
were still struggling to discriminate 
between correlation and causation, 
the program director organized a 
series of faculty meetings to identify 
student misconceptions and design 
learning strategies to implement new 
pedagogies. 


Given that the credit window for Spring Assessment Day captures students at various stages of general education completion, our pretest-posttest design also allows change over time to be explored for different subsets of students (Pieper, Fulcher, Sundre, & Erwin, 2008). For instance, students who have yet to take any courses in a general education program are compared to students who have partially completed or fully completed the program (as shown for our American History and Political Science requirement in Table 4).¹ (See also Hathcoat, Sundre, & Johnston, 2015, Tables 6 and 7.) Furthermore, we consider score differences among students who have completed their requirements elsewhere (e.g., transfer credits, Advanced Placement credits), allowing us to explore the impact of non-JMU coursework.²


Table 3. Effect Sizes for Six Assessments for Students Tested on Fall Assessment Day, 2014, and Spring Assessment Day, 2016

Acronym   Test Name                                                    Content Area                          N     d
NW9       Natural World, version 9                                     Quantitative & scientific reasoning   194   0.53
GLEX2     The Global Experience, version 2                             Global history & issues               243   [illegible]
AMEX3     The American Experience, version 3                           American history & political science  246   0.33
ISNW-A1   Institute for Stewardship of the Natural World, version A1   Environmental stewardship             413   0.40
KWH8      Knowledge of Wellness and Health, version 8                  Wellness & health                     253   1.33
SDA-7     Sociocultural Domain Assessment, version 7                   Sociocultural understanding           295   0.77

Note. Effect sizes (d) were calculated by subtracting the Fall 2014 average score from the Spring 2016 average score and dividing by the Fall 2014
standard deviation. The d values can be interpreted as the number of standard deviation units by which the Spring 2016 average differs from the Fall
2014 average. With the exception of the ISNW-A1, results are based on only those students who had completed their content area requirement through
coursework at our university by Spring 2016. Because there is no such requirement in environmental stewardship, results for the ISNW-A1 are based on
all students who completed the test in both fall and spring.
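
The effect-size calculation described in the note to Table 3 (spring mean minus fall mean, divided by the fall standard deviation) can be sketched in a few lines. This is an illustration only, not the authors' code: the function name and the score lists are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the article does not say which form was used.

```python
from statistics import mean, stdev


def effect_size(fall_scores, spring_scores):
    """Standardized pretest-posttest change, as described in the Table 3 note.

    d = (spring mean - fall mean) / fall standard deviation.
    Assumption: the sample standard deviation (n - 1) is used.
    """
    return (mean(spring_scores) - mean(fall_scores)) / stdev(fall_scores)


# Hypothetical number-correct scores for the same three students.
fall = [20, 22, 24]
spring = [23, 25, 27]
d = effect_size(fall, spring)
print(f"d = {d:.2f}")
```

A d of 0.50 would mean the spring average sits half a fall standard deviation above the fall average.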


Because of the advantages of our 
Assessment Day model, we continue 
to use it year after year. Of course, the 
current design looks quite different 
from how it looked 30 years ago. In 
response to challenges encountered 
along the way, many modifications
have been made, and continue to
be made, to our Assessment Day
model. In the sections below, we build 
on the work of Grays and Sundre 
(2012) by describing our model and 
sharing what we have learned from its 
implementation. Specifically, we detail 
the logistics involved, highlighting 
physical materials and communication
strategies. We also describe the
logistics team and its responsibilities 
before, during, and after Assessment 
Day. Furthermore, we describe the 
important role that proctors play on 
Assessment Day and the process we 
use for their hiring and training. The 
paper concludes with discussion of 
changes we have made to Assessment 
Day and the challenges we continue to 
encounter. 


¹ The results in Table 4 are typical of the kind of results we see on many of our assessments. We often see gains in knowledge over time, but not of the
magnitude we would like. As well, increased coursework in the domain is often not strongly related to pretest-posttest gains. Faculty reactions and
explanations for such results are provided in Mathers, Finney, and Hathcoat (2018).


² Of course, because students were not assigned randomly to these different experiences, we cannot claim that different kinds or amounts of coursework
cause these score changes. To strengthen the causal link between assessment results and experiences, we’ve used alternative analytical techniques
(e.g., propensity score analysis; Harris & Horst, 2016) and implementation fidelity studies, which consider the extent to which programs are delivered as
intended (Fisher, Smith, Finney, & Pinder, 2014; Gerstner & Finney, 2013; Swain, Finney, & Gerstner, 2013).
Table 4. Number Correct Mean (and Standard Deviation) on the 40-item American Experience Assessment on Fall and Spring
Assessment Days by Course Completion Status

                                                                        N     Fall Assessment Day    Spring Assessment Day
                                                                              (Pretest) 2016         (Posttest) 2018
JMU course completed
  American History                                                      150   [illegible]            [illegible]
  Political Science                                                     71    24.1 (5.6)             25.0 (5.4)
JMU course not completed
  Not currently enrolled in American History/Political Science course   85    22.7 (5.9)             23.1 (5.7)
  Currently enrolled in American History/Political Science course       52    21.6 (5.9)             23.7 (5.4)


Note. These results are typical of the kind of results we see on many of our assessments. We often see gains in knowledge over time, but not of the 
magnitude we would like. As well, increased coursework in the domain is often not strongly related to pretest-posttest gains. Faculty reactions and 
explanations for such results are provided in Mathers et al. (2018). 


THE JMU ASSESSMENT DAY MODEL


Between 3,800 and 4,800 students are 
required to attend each Assessment 
Day, with incoming first-year students 
(excluding transfer students) tested 
during Fall Assessment Day and 
students with 45-70 credit hours 
tested during Spring Assessment Day. 
Rather than relying on volunteers or 
convenience samples, JMU requires 
all qualifying students to participate 
in Assessment Days. This helps us 
represent students who have taken 
different academic paths and ensures 
that our results are fully reflective of 
the JMU experience. If a student is 
required to participate and fails to do 
so, a hold is placed on their record, 
prohibiting modifications to their 
current schedule and future course 
registration. This policy not only
demonstrates to students and other
stakeholders JMU's strong commitment
to quality assessment, but also ensures
participation. Attendance is high: the
5-year attendance rates on Fall and
Spring Assessment Days are 94% and
90%, respectively.


The current Assessment Day structure 
includes three 2-hour testing sessions, 
with the sessions each separated by 
about an hour. During each session, 
one third of the required students 
(1,200-1,500 students) are tested. To 
accommodate this number of students 
in a single session, about 25 different 
rooms are used, with each room 
seating between 30 and 170 students. 
Almost all rooms are located within a 
single building, which allows our team 
to be on hand to address any issues. 
Testing rooms are reserved more than 
a year in advance and include large 
lecture halls, small classrooms, and 
computer labs. To illustrate, the rooms
used during Spring 2016 are listed in
Table 5.


In the fall, commandeering almost
an entire building is not an issue
because Assessment Day takes place
the Friday before classes begin. Spring
Assessment Day, however, takes place
on a Tuesday in mid-February; to
avoid scheduling conflicts, all classes
are cancelled until 4:00 p.m. This
not only frees space on campus for
university-wide assessment, but also 
allows students who are not required 
to participate in Assessment Day to 
participate in academic program 
assessment. 


As many as 25 different assessments are administered on Assessment Day; each student completes no more than four assessments during their 2-hour testing session (see Table 6). Thus, large random samples of students complete each assessment, but no student completes all assessments. Assessing every student on all outcomes is not necessary because the data are not used for individual assessment purposes. The vast majority of assessments are used for program assessment purposes and are direct measures of student learning.³ New and revised assessments are also piloted and evaluated for future use. This is particularly important because many of our assessments are developed by our own faculty and staff to maximize the alignment between program outcomes and instruments. Because the responsibility for the psychometric evaluation of these assessments falls on us, a small proportion of Assessment Day data is devoted to this purpose.


Table 6. Assessments Administered in Spring 2016 and Sample Size

Acronym   N     Name                                                         Content Area
AHQ       510   Arts & Humanities Questionnaire                              Arts & humanities
AHQ2      528   Arts & Humanities Questionnaire, version 2                   Arts & humanities
AMEX3     960   The American Experience, version 3                           American history & political science
CAT       165   Critical-thinking Assessment Test                            Critical thinking
ERIT-XA   534   Ethical Reasoning Identification Test, version XA            Ethical reasoning
ERRT      234   Ethical Reasoning Recall Test                                Ethical reasoning
ER-WRA    180   Ethical Reasoning, Writing, version A                        Ethical reasoning
GLEX2     585   The Global Experience, version 2                             Global history & issues
INFOCORE  183   Information Literacy Core                                    Information literacy
ISNW-A1   528   Institute for Stewardship of the Natural World, version A1   Environmental stewardship
KWH8      585   Knowledge of Wellness and Health, version 8                  Wellness & health
MFLS      165   Meaningful Life Survey                                       Purpose & meaning in life
NW9       510   Natural World, version 9                                     Quantitative & scientific reasoning
NW9X      960   Natural World Short Form, version 9                          Quantitative & scientific reasoning
OCP2      486   Oral Communications Pretest, version 2                       Oral communication
SD-1      3282  Student Development, version 1                               Student development
SD-3      1065  Student Development, version 3                               Student development
SDA-7     534   Sociocultural Domain Assessment, version 7                   Sociocultural understanding
SOS-2     4437  Student Opinion Survey, version 2                            Examinee motivation
STPA2     201   Sociocultural Thought Process Assessment, version 2          Sociocultural reasoning

Note. Seventy percent of the assessments listed here are direct measures of student learning (as opposed to self-report measures of learning or self-report measures of attitudes, feelings, or behaviors). With the exception of the CAT, the direct measures listed here were created by faculty at the university.


Data are also collected for the 
psychometric evaluation of 
instruments developed outside of 
JMU. Importantly, validity studies are 
conducted to ensure instruments are 
appropriate for use with our student 
population and for the purposes of 
program assessment. Examples of 
how Assessment Day data have been 
used in psychometric evaluations are 
provided by Brown, Finney, and France 
(2011), Cameron, Wise, and Lottridge 
(2007), Kopp, Zinn, Finney, and Jurich 
(2011), France, Finney, and Swerdzewski 
(2010), Johnston and Finney (2010),
Smiley and Anderson (2011), and Taylor
and Pastor (2007). 


Planning for Assessment Day 
Planning for each Assessment Day 
begins months in advance with the 
creation of a spreadsheet known as 
the master plan that details which 
assessments and student identification 
numbers are assigned to the various 
rooms and sessions (see Table 5). In the 
section below, we describe how and 
when these decisions are made, and 
from whom we gather the necessary 
information. 


One of the first tasks involved in
planning for a Fall Assessment Day
is deciding which assessments to
administer.⁴ Four months prior to
Fall Assessment Day, assessment
coordinators for general education
programs and university-wide
initiatives are asked to provide
information about the measure(s)
that their university-wide program
wishes to administer. We ask for
the length of time it will take to
complete the instrument(s), whether 
computer-based or paper-and-pencil 
administration is preferred, and the 
desired sample size. We then create 
test configurations based on this 
information (i.e., sets of three to four 
measures that can be given together 
and require slightly less than a total of 2 
hours to complete). 


Once the configurations are 
determined, we assign configurations 
to each testing room. In each room the 
same test configuration is used across 
each of the three testing sessions for 
two reasons. First, because proctors 
remain in the same room across 
sessions, keeping the test configuration 
consistent helps to avoid proctor 
confusion. Second, in paper-and-pencil 
testing rooms students provide their 
responses on Scantrons (i.e., optical 
answer sheets); as such, the paper 
copies of the tests remain unmarked 
and the same paper copies of tests
can be reused across sessions. This
helps keep the number of printed test 
copies to a minimum, which helps 
reduce costs and keep Assessment Day 
environmentally friendly. 


The final step is to assign students
to rooms and sessions based on
the last three digits of their student
identification numbers, as shown in the
last three columns of Table 5.⁵ Because
the last several digits of identification
numbers are used to assign students to
rooms, the sample of students assigned
to each room, and subsequently to
each test, is random.
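
The assignment procedure (spelled out in the accompanying footnote: sort by the last three digits, fill the first room in Session A, move to the next room once it is full, then repeat for Sessions B and C) can be sketched as follows. This is a rough illustration, not the authors' spreadsheet logic. The function name, room names, and capacities are hypothetical, and one simplifying assumption is added: a suffix group that would overfill a room is moved to the next room rather than split.

```python
from collections import Counter


def assign_by_suffix(student_ids, rooms, sessions=("A", "B", "C"), digits=3):
    """Map each ID suffix to a (session, room) slot.

    rooms: list of (room_name, capacity) tuples in fill order (hypothetical).
    Returns {suffix: (session, room_name)}.
    """
    # Count how many students share each suffix (e.g., "000" to "999").
    counts = Counter(str(sid)[-digits:].zfill(digits) for sid in student_ids)

    # Slots are filled in order: every room for Session A, then B, then C.
    slots = [(session, name, cap) for session in sessions for name, cap in rooms]

    assignment = {}
    slot_index, seated = 0, 0
    for suffix in sorted(counts):
        # Assumption: move on if this suffix group would overfill the room.
        if seated > 0 and seated + counts[suffix] > slots[slot_index][2]:
            slot_index += 1
            seated = 0
        if slot_index >= len(slots):
            raise ValueError("not enough seats for all students")
        session, room, _ = slots[slot_index]
        assignment[suffix] = (session, room)
        seated += counts[suffix]
    return assignment


# Hypothetical example: six students, two rooms that seat two students each.
ids = [f"11{n:03d}" for n in range(6)]
plan = assign_by_suffix(ids, rooms=[("Room 1", 2), ("Room 2", 2)])
for suffix, (session, room) in plan.items():
    print(suffix, session, room)
```

Because the mapping is keyed on the suffix rather than on individual students, the same table can be published to students in advance, in the spirit of the last three columns of Table 5.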


The above description characterizes the planning involved for Fall Assessment Days. When developing a master plan for Spring Assessment Days, we use the plan previously configured for the same cohort of students for their Fall Assessment Day, modifying as necessary. This helps ensure Spring Assessment Day students are assigned to complete the same measures as when they were incoming first-year students.


³ A direct measure of student learning tests a student's knowledge and skills. For example, rather than asking students to self-report
whether they are skilled in information literacy, we use a knowledge test to evaluate whether students are skilled in information literacy.


⁴ Every general education program and university-wide initiative is assessed on every Assessment Day. If there is any concern about
whether a program should be assessed, guidance is obtained from the university’s Assessment Advisory Council, which is a team of
administrators, faculty, and staff whose purpose is to provide guidance on these very issues.


Human Resources 

Substantial human resources are 
needed to orchestrate each Assessment 
Day. In this section, we describe two 
essential groups: the Assessment Day 
team that works year-round on the 
planning, coordination, and execution 
of each Assessment Day; and the 
Assessment Day proctors. 


The Assessment Day Team 

The Assessment Day team is a 
subgroup of the Center for Assessment 
and Research Studies (CARS), which
is the unit on campus responsible
for providing guidance regarding
the assessment of student learning
and developmental outcomes.⁶ The
Assessment Day team is responsible
for planning and coordinating both
Assessment Days, as well as for the
associated data management that
occurs afterward. It consists of a
faculty lead, three graduate assistants
(GAs), and an administrative assistant.
Additionally, the team relies heavily
on the CARS information security
analyst, fiscal technician, and three 
undergraduate work-study students 
to assist in tasks crucial to a successful 
Assessment Day (e.g., storing data 
securely, processing paperwork for 
paying proctors, packing and double- 
checking materials). 


No member of the Assessment Day 
team devotes their entire work week 
year-round to Assessment Day. The 
current faculty lead of Assessment 
Day devotes 8-10 hours per week, on 
average. During the fall and spring 
semesters one GA on the team has 20 
hours per week assigned to Assessment 
Day, and the remaining two GAs have 
10 hours per week. The work-study 
students assist during the fall and 
spring semesters, with each of the 
students spending about 8 hours per 
week on Assessment Day tasks during 
the busiest times of the year. 


The work associated with Assessment 
Day is not constant throughout the 
year; it is heaviest the 2 months before 
and after each Assessment Day. Each 
member of the team has different
responsibilities prior to, during, and
after Assessment Day, which are 
described below. The tasks typically 
completed by the work-study students 
during these times are also provided. 


Prior to Assessment Day 

Many of the tasks completed prior to 
Assessment Day were detailed above 
in the planning section. Examples 
include soliciting and organizing test 
requests, compiling test instructions, 
communicating with students and 
constituents on campus, printing 
proctor materials, and packing bins. 
These tasks are split among the GAs, 
followed by a rigorous round of quality 
checks, some of which are completed 
by the faculty lead and the work-study 
students. Prior to Assessment Day, the 
administrative assistant reserves testing 
rooms, hires proctors, and coordinates 
meal services. The faculty lead is 
primarily responsible for coordinating 
work among the team members and 
ensuring that work is completed by the 
prespecified deadlines. 


During Assessment Day 

During Assessment Day the administrative assistant oversees the completion of paperwork for hiring proctors, coordinates delivery of meals, and answers the phone in the room that serves as headquarters. The faculty lead welcomes the proctors and answers questions. Once proctors proceed to their designated rooms and students begin to arrive, two of the GAs act as runners who move throughout the testing rooms to help proctors set up. The third GA and the faculty lead remain in headquarters to respond to any other needs and to monitor the CARS email account for student questions. The CARS information security analyst is also present in headquarters to assist with technology issues. After the final testing session, the team collects materials, packs up the headquarters room, checks all testing rooms for any forgotten materials, and ensures rooms are left the way they were found.


⁵ Specifically, we begin by acquiring the list of student identification numbers for all incoming first-year students and sort this list by the
last three digits of the identification number. Starting with a value of 000, we assign three-digit values to rooms and sessions, starting
with the first room and Session A. Once the number of students reaches the room size, we progress to the next room. After we have
progressed through all rooms for Session A in this manner, we repeat the process for Session B and then Session C. Starting in Fall 2018 we
began assigning students based on the last four digits of their identification number (instead of three digits) to accommodate increases
in the size of the student body.


⁶ At our university, the assessment office (CARS) and the Office of Institutional Research are separate and the latter does not assist with
Assessment Days. In many universities, assessment falls under the purview of an institutional research office or a larger strategic
planning office. How feasible it is to implement the Assessment Day model in these different organizational configurations depends on the
number of staff, size of the student body, and the scope of assessment (e.g., number of assessments, number of Assessment Days).


After Assessment Day 

After Assessment Day the GAs 
oversee the work-study students in 
the unpacking of all materials (e.g., 
Scantrons, tests, pencils, folders, bins, 
binders) and their inventory. The work-study
students also check technology,
such as Chromebooks (i.e., tablet-like 
laptops), to ensure that everything is 
in working order. In sum, the work- 
study students help us ensure that all 
materials are accounted for and ready 
for future use. 


Scanning and downloading of
data is completed within a week of
Assessment Day, thereby allowing
the team to track attendance.
Students who failed to attend (either
for legitimate reasons or out of
delinquency) have a hold placed on 
their record and are contacted via email 
about make-up sessions. There are 
typically two to six make-up sessions, 
each accommodating about 100 
students, scheduled in the evenings 
several weeks after Assessment Day. 
The GAs plan and proctor the make- 
up sessions, and the administrative 
assistant removes holds for students 
who attend. 


The management of all data also occurs
within a month after Assessment
Day and includes data scanning,
downloading, cleaning, scoring,
and formatting. Using the student
identification numbers supplied by the 
student on each assessment, the data 
are also merged with other information 
needed for program assessment 
purposes; for instance, assessment 
scores for each student are merged 
with relevant course information. All 
GAs aid in data management and 
subsequent quality checks. Each 
program's assigned assessment liaison 
(with assistance from their own GAs) 
completes the analyses and report 
writing for each assessment within
3 months of testing.⁷ Results are
reported to the program faculty and
staff, who may choose to disseminate
the results more widely. Although it
varies across programs, faculty and
staff often meet to discuss the results
and consider potential changes to their
program. They are encouraged to use
a learning improvement model, where
assessment results obtained after
program changes have been made are
used to determine if the changes were
effective (Fulcher, Good, Coleman, & 
Smith, 2014). 
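
The merging step described above, in which assessment scores are joined to course information on the student identification number, might look like the following sketch. It is illustrative only: the field names (`student_id`, `test`, `score`, `completed_american_history`) and the records are hypothetical, and a plain dictionary lookup stands in for whatever statistical software is actually used.

```python
def merge_scores_with_courses(scores, courses):
    """Left-join score records to course records on student ID.

    Every score record is kept; course fields are added when a match exists.
    """
    courses_by_id = {row["student_id"]: row for row in courses}
    merged = []
    for score in scores:
        course_info = courses_by_id.get(score["student_id"], {})
        merged.append({**course_info, **score})
    return merged


# Hypothetical records for two students.
scores = [
    {"student_id": "1001", "test": "AMEX3", "score": 25},
    {"student_id": "1002", "test": "AMEX3", "score": 21},
]
courses = [
    {"student_id": "1001", "completed_american_history": True},
]
merged = merge_scores_with_courses(scores, courses)
for row in merged:
    print(row)
```

Keeping unmatched score records, rather than dropping them, makes it easy to spot identification numbers that students entered incorrectly during the later quality checks.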


Assessment Day Proctors 

Proctors are an important human 
resource that we greatly rely on. 
Although the number of proctors varies, 
our goal is to have one proctor for every 
30 students with no fewer than two 
proctors in a room, which results in about 
55 to 75 proctors. Proctor recruitment 
begins 2 months before Assessment Day 
when the team’s administrative assistant 
emails a job announcement and online 
application form to a list of potential 
proctors (including JMU graduate 
students, staff, and people who have 
previously served as proctors). We have 
many people in the local community 
who regularly proctor, many of whom 
are retired educators. From this referral- 
based network, completed applications 
are selected on a first-come, first-served 
basis. The application is closed once we 
have enough proctors, which typically 
occurs within 3 weeks. Proctors are paid a 
small stipend and are provided breakfast 
and lunch on Assessment Days. 
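
The staffing rule stated above (one proctor for every 30 students, and never fewer than two proctors in a room) reduces to a small calculation. The sketch below is illustrative; the function name and the room sizes are hypothetical, chosen from the 30 to 170 seat range mentioned earlier, and rounding up to a whole proctor is an assumption.

```python
import math


def proctors_needed(students_in_room, students_per_proctor=30, minimum=2):
    """One proctor per 30 students, with at least two proctors in any room."""
    return max(minimum, math.ceil(students_in_room / students_per_proctor))


# Hypothetical room sizes.
room_sizes = [30, 60, 170]
total = sum(proctors_needed(n) for n in room_sizes)
print(total)
```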


Because there are at least two proctors 
per room, it is important that proctors 
within a room act as a team. To facilitate 
cooperation, one proctor is assigned to 
be lead proctor; he or she acts as the 
spokesperson to the students, directs 
the testing session, and delegates tasks 
among other proctors. Both lead and 
non-lead proctors are responsible for
a variety of other tasks. For instance,
proctors are responsible for preparing
the room for each session and
maintaining order (e.g., minimizing
noise, disruptions, and inappropriate
behaviors). Proctors also convey the
importance of the assessments and
create an environment that allows and
encourages students to perform to
the best of their ability. Thus, proctors
have an important role in ensuring
the quality of the data: they motivate
students, ensure tests have been
completed correctly, and report any
noteworthy issues that could impact
the results. How proctors are trained
to accomplish these tasks is briefly
described below and in more detail by
Lau, Swerdzewski, Jones, Anderson,
and Markle (2009).


⁷ Care is taken in reporting so that the results can only be used to evaluate programs, not individual students or faculty members.


Changes Made to Assessment Day

JMU’s Assessment Day model has evolved over time. Many changes have been made in response to increases in the size of the student body, developments in testing technology, and issues encountered after implementing an Assessment Day. Because our model has been in place for more than 30 years, it is impractical to describe all of the changes that have been made. We focus here on large changes that have improved the quality of data, saved money, improved efficiency, or reduced the environmental impact of Assessment Day.


Number of Testing Sessions 

Perhaps the most significant change
made in recent years is the transition
from two 3-hour testing sessions
to three 2-hour testing sessions.
This change allowed the number of
students tested to be distributed over
three sessions instead of two, thereby
requiring fewer rooms, proctors, and
testing materials. For instance, when
the Spring 2015 administration, which 
used the three 2-hour testing session 
structure, was compared to the Spring 
2014 administration, which used the 
two 3-hour structure, substantial 
decreases were noted in the number of 
proctors (438%), Scantrons (45%), and 
copies of assessments (156%). Not only 
did this change reduce the amount of 
time required by any one student for 
testing, but it also greatly reduced costs 
as well as the environmental impact of 
Assessment Day. 


Assessment Day Video 

In 2014, we began showing students a 5-minute video at the beginning of each testing session; in this video, the president of the university, general education faculty, and student actors explain the purpose of Assessment Day. The purpose of showing the video is twofold: to increase student motivation and to standardize how information is communicated. By informing students how the data collected on Assessment Day are used to improve student learning on campus, we hope to convey how completing the assessments to the best of their ability directly affects the quality of education at JMU as well as its reputation. Readers interested in viewing the video can find the link at JMU (2018, top of page).


Proctor Selection and Training 

A few years ago, we modified the way we recruit and hire proctors. We converted our proctor-hiring methods from an informal email process to a formal online application. Under the new hiring method, proctors complete an online application that allows us to collect necessary information before Assessment Day. The online application has also allowed us to ensure that our proctors are comfortable with technology. As we move to testing that is more computer-based, we need proctors who can navigate various types of technology with ease. By having proctors apply through an online form, we create a preliminary screening process for this skill. Additionally, we have started recruiting JMU graduate students to serve as proctors, which provides many benefits. Graduate students are generally familiar with and comfortable navigating JMU's campus and classroom technology, and they usually have less hiring paperwork to process because they are often already JMU employees (e.g., GAs). The quality of proctors is somewhat controlled by our detailed job description and online application process. The Assessment Day team also observes proctors during Assessment Day and does not rehire proctors who perform poorly.


Another notable change we have made to Assessment Day is to the timing and format of proctor training. At one time, proctors were trained the morning of Assessment Day; however, the training session added an hour to an already long day and was often rushed. A lot of information was packed into a quick presentation, leaving little time for proctors to reflect on the material and ask questions before being ushered into their rooms. To address these challenges, we moved the training online, which allows us to track which proctors have completed training and allows proctors to complete the training in their own space and time during the 2 weeks prior to Assessment Day.


Ongoing Challenges 

Student Motivation 

The primary purpose of Assessment Day is to collect meaningful information about what students know, think, and can do. Our ability to make valid inferences from students’ scores relies on the quality of the data we collect. Unfortunately, the quality of the data is undermined when students are not motivated. Although we attempt to convey the purpose and importance of Assessment Day to students, the assessments are still low-stakes for students, and, as in any low-stakes assessment context, examinee motivation can suffer.


Concerns about student motivation are mitigated somewhat by data indicating that the majority of students think the assessments are important and try their best (e.g., see Sundre & Wise, 2003, Table 2). This is particularly true of incoming first-year students. However, because these findings do not characterize all students, we are continuously looking for ways to improve motivation. One strategy we use is to train our proctors to use motivational strategies as part of their role. We began intentionally training proctors in 2007 to use motivational strategies (e.g., conveying the importance of the test, being supportive yet firm) and found that students’ self-reported effort on the assessments was higher and less variable on Assessment Days that took place after this training was implemented (Lau et al., 2009).


We have also studied the effects of providing different instructions to students (Finney, Sundre, Swain, & Williams, 2016). In this study, students were randomly assigned to one of three sets of instructions: In Condition 1 we told students that their scores would be aggregated and used for institutional decision-making, in Condition 2 we expanded on Condition 1 by telling students they would be able to receive their individual scores, and in Condition 3 we added to Conditions 1 and 2 by informing students that their individual scores would also be shared with faculty. Neither test performance from pretest to posttest nor the test-taking motivation measures were affected by the kind of instructions the student received.


We have also piloted different assessment designs, such as a planned missingness design, to investigate whether giving students a portion of the assessment rather than the whole assessment can improve motivation and performance (Swain, 2015). Although the effects were small, students completing only a portion of the assessment (about 33 items) performed better than students completing the whole assessment (66 items). In addition, their motivation was more favorable, but not significantly so.
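
To make the idea of a planned missingness design concrete, the sketch below splits a 66-item pool into two 33-item blocks and randomly assigns each student one block. It is a minimal illustration under our own assumptions: the two-block split, the function name, and the student identifiers are hypothetical, and the design actually piloted (Swain, 2015) may have been structured differently.

```python
import random


def assign_item_blocks(student_ids, n_items=66, n_blocks=2, seed=0):
    """Randomly partition items into blocks, then randomly assign one
    block to each student so that block sizes stay balanced."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible

    # Shuffle item numbers 1..n_items and deal them into n_blocks blocks.
    items = list(range(1, n_items + 1))
    rng.shuffle(items)
    blocks = [sorted(items[i::n_blocks]) for i in range(n_blocks)]

    # Shuffle students, then rotate through the blocks.
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    assignment = {
        sid: blocks[pos % n_blocks] for pos, sid in enumerate(shuffled)
    }
    return blocks, assignment


# Hypothetical student identifiers.
students = ["s01", "s02", "s03", "s04", "s05", "s06"]
blocks, assignment = assign_item_blocks(students)
for sid in students:
    print(sid, len(assignment[sid]), "items")
```

Because assignment to blocks is random, the item responses a student never sees are missing by design, which is what lets scores from the partial forms be analyzed alongside one another.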


As a final example, the use of electronic pop-up messages targeted at students displaying rapid responding behavior on computer-based tests has been investigated (Ong, Pastor, & Yang, 2018; Wise, Bhola, & Yang, 2006), with mixed results regarding the effectiveness of this intervention. In addition to changes aimed at improving student motivation, we are continuously researching different ways to measure motivation (e.g., self-report, item response time; Wise & Kong, 2005), assess its impact on the inferences we make (e.g., Finney et al., 2016), and accommodate the issue during our analyses (e.g., Foelber, 2017; Sundre & Wise, 2003).
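
As an illustration of how item response time can serve as a motivation measure, the sketch below computes a response-time-based effort index in the spirit of Wise and Kong (2005): the proportion of items on which a student's response time meets or exceeds an item-level threshold separating rapid guessing from solution behavior. The function name, the 3-second thresholds, and the 0.90 flagging cut-off are hypothetical values chosen for the example, not values used at JMU.

```python
def response_time_effort(response_times, thresholds):
    """Proportion of items answered with solution behavior, where an item
    counts as solution behavior if its response time (in seconds) is at
    least that item's rapid-guessing threshold."""
    if len(response_times) != len(thresholds):
        raise ValueError("need one threshold per item")
    if not response_times:
        raise ValueError("need at least one item")
    solution_behavior = [
        rt >= th for rt, th in zip(response_times, thresholds)
    ]
    return sum(solution_behavior) / len(solution_behavior)


# Hypothetical data: four items, each with a 3-second threshold.
times = [12.0, 1.5, 30.2, 2.0]
thresholds = [3.0, 3.0, 3.0, 3.0]

rte = response_time_effort(times, thresholds)
print(rte)

# Hypothetical cut-off: flag students whose index falls below 0.90.
print("flag for low effort:", rte < 0.90)
```

An index like this could be used either to trigger an intervention during testing, such as the pop-up messages described above, or to filter low-effort responses during analysis.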


Efficiency 

Another important challenge we continue to face is efficiency. We have turned to electronic data collection as a primary way of reducing both our costs and our environmental impact. We consistently prioritize the use of on-campus computer labs to reduce the number of both paper tests and Scantrons needed. Furthermore, we recently incorporated the use of Chromebooks, which are tablet-like laptops, in rooms that were formerly used for paper-and-pencil testing. This allows us to assess around 200 students outside of a computer lab without resorting to Scantrons. We have also experimented with having students respond via handheld survey response tools on their smartphones (Sauder, Foelber, Jacovidis, & Pastor, 2016).


With the emphasis on electronic data collection, the challenges we currently face are mostly physical limitations (e.g., number of available computer labs). A similar challenge is the lack of alternative technology for assessing students outside of the computer labs. The Chromebooks continue to be valuable in this regard, but we are limited by the number of Chromebooks we can purchase. Our experiments with handheld responding devices have been challenged by considerations of cost and ease of use. Our pilot of student-owned smartphones had the advantage of being free for us, but brought its own challenges in terms of test security and student attention. Yet, we are optimistic about the future of technology in Assessment Day to increase our efficiency.


CONCLUSION 


While the details can be dense, we hope they convey the thought and intentionality involved in our Assessment Day model. It is our hope that this information benefits institutions wanting to adopt an Assessment Day model for university-wide assessment. For institutions where our model may not be feasible or even desirable to implement on a large scale, aspects of our model can be adopted for assessment on a much smaller scale, even for the assessment of a single program. For institutions with Assessment Days already in place, we hope our description provides ideas for different ways to implement the model and alternative solutions for addressing its challenges. In sum, our intention is to share lessons learned and encourage readers to consider how our model might be adapted for their own purposes.


Although we featured our Assessment Day model, we are open to and supportive of any design that facilitates the collection of quality data. In addition, we encourage any institution with a quality process to share its approach and lessons learned with others. Let’s share quality practices to better answer the calls for accountability and support legitimate learning improvement efforts.